{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# DSPy-Clarifai lm and retriever example notebook\n",
    "\n",
    "This notebook will walk you through on the integration of clarifai into DSPy which enables the DSPy users to leverage clarifai capabilities of calling llm models from clarifai platform and to utilize clarifai app as retriever for their vector search use cases."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install clarifai"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install dspy-ai"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Import necessary packages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "import dspy\n",
    "from dspy.retrieve.clarifai_rm import ClarifaiRM "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Initialize clarifai app id, user id and PAT.\n",
    "Create an AI app in clarifai portal in < 1 min by following this link [getting started](https://docs.clarifai.com/clarifai-basics/quick-start/your-first-predictions).\n",
    "\n",
    "You can browse the portal to obtain [MODEL URL](https://clarifai.com/explore/models) for different models in clarifai community."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "#for the demo we are going with llama2-70b-chat\n",
    "MODEL_URL = \"https://clarifai.com/meta/Llama-2/models/llama2-70b-chat\" \n",
    "PAT = \"CLARIFAI_PAT\"\n",
    "USER_ID = \"YOUR_ID\"\n",
    "APP_ID = \"YOUR_APP\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data ingestion into clarifai vectordatabase\n",
    "\n",
    "To use clarifai as retriever all you have to do is ingest the documents into clarifai app that serves as your vectordatabase to retrieve similar documents.\n",
    "To simplify the ingestion, we are utilising the clarifaivectordatabase integration for ingestion."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#run this block to ingest the documents into clarifai app as chunks.\n",
    "# if you encounter any issue, make sure to run `pip install langchain`\n",
    "\n",
    "from langchain.text_splitter import CharacterTextSplitter\n",
    "from langchain.document_loaders import TextLoader\n",
    "from langchain.vectorstores import Clarifai as clarifaivectorstore\n",
    "\n",
    "loader = TextLoader(\"YOUR_TEXT_FILE\") #replace with your file path\n",
    "documents = loader.load()\n",
    "text_splitter = CharacterTextSplitter(chunk_size=1024, chunk_overlap=200)\n",
    "docs = text_splitter.split_documents(documents)\n",
    "\n",
    "clarifai_vector_db = clarifaivectorstore.from_documents(\n",
    "    user_id=USER_ID,\n",
    "    app_id=APP_ID,\n",
    "    documents=docs,\n",
    "    pat=PAT\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Initialize LLM class"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Make sure to pass all the model parameters in inference_params field of clarifaiLLM class. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "llm=dspy.Clarifai(model=MODEL_URL, api_key=PAT, n=2, inference_params={\"max_tokens\":100,'temperature':0.6})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Initialize Clarifai Retriever model class\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "retriever_model=ClarifaiRM(clarifai_user_id=USER_ID, clarfiai_app_id=APP_ID, clarifai_pat=PAT, k=2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "configure dspy with llm and rm models."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [],
   "source": [
    "dspy.settings.configure(lm=llm, rm=retriever_model)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example: dspy.signature and dspy.module with clairfaiLLM"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "NEGATIVE\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "sentence = \"disney again ransacks its archives for a quick-buck sequel .\"  # example from the SST-2 dataset.\n",
    "\n",
    "classify = dspy.Predict('sentence -> sentiment')\n",
    "print(classify(sentence=sentence).sentiment)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example: Quick glimpse into how our retriever works when a query is passed to the dspy.Retrieve class\n",
    "\n",
    "Here we have used the rulebook of Formula student Germany competition.\n",
    "\n",
    "link : https://www.formulastudent.de/fileadmin/user_upload/all/2024/rules/FS-Rules_2024_v1.0.pdf \n",
    "\n",
    "We have used the .txt version of the file for our demo."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [],
   "source": [
    "retrieve = dspy.Retrieve()\n",
    "topK_passages = retrieve(\"can I test my vehicle engine in pit?\").passages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['A 6.8.3\\n\\nCranking engines in the pits is allowed, when the following conditions are met:\\n• The vehicle has passed mechanical inspection.\\n• The driven axles are securely jacked up.\\n• Gearbox is in neutral.\\n• All driven wheels are removed.\\n• Connectors to all injectors and ignition coils are detached.\\n• A fire extinguisher must be placed next to the engine.\\n\\nA 6.9\\n\\nFueling and Oil\\n\\nA 6.9.1\\n\\nFueling may only take place at the fuel station and must be conducted by officials only.\\n\\nA 6.9.2\\n\\nOpen fuel containers are not permitted at the competition.\\n\\nA 6.9.3\\n\\nWaste oil must be taken to the fuel station for disposal.\\n\\nA 6.10\\n\\n[EV ONLY ] Working on the Vehicle\\n\\nA 6.10.1\\n\\nAll activities require the TSAL to be green.\\n\\nA 6.10.2\\n\\nA prominent manual sign indicating the “TSAL green” state must be present whenever the\\nLVS is switched off and the requirements for an only green TSAL according to EV 4.10 are\\nmet.\\n\\nA 6.10.3', 'A 6.8.3\\n\\nCranking engines in the pits is allowed, when the following conditions are met:\\n• The vehicle has passed mechanical inspection.\\n• The driven axles are securely jacked up.\\n• Gearbox is in neutral.\\n• All driven wheels are removed.\\n• Connectors to all injectors and ignition coils are detached.\\n• A fire extinguisher must be placed next to the engine.\\n\\nA 6.9\\n\\nFueling and Oil\\n\\nA 6.9.1\\n\\nFueling may only take place at the fuel station and must be conducted by officials only.\\n\\nA 6.9.2\\n\\nOpen fuel containers are not permitted at the competition.\\n\\nA 6.9.3\\n\\nWaste oil must be taken to the fuel station for disposal.\\n\\nA 6.10\\n\\n[EV ONLY ] Working on the Vehicle\\n\\nA 6.10.1\\n\\nAll activities require the TSAL to be green.\\n\\nA 6.10.2\\n\\nA prominent manual sign indicating the “TSAL green” state must be present whenever the\\nLVS is switched off and the requirements for an only green TSAL according to EV 4.10 are\\nmet.\\n\\nA 6.10.3']\n"
     ]
    }
   ],
   "source": [
    "print(topK_passages)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## RAG dspy module using clarifai as retriever"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Generally to construct a module in dspy, you might need to define \n",
    "\n",
    "Signature: \n",
    "explain the input and output fields in an intuitive way with just few words.\n",
    "(\"question\"-> \"answer\")\n",
    "\n",
    "Module:\n",
    "Module can be something where you put the signatures into action by defining a certain module which compiles and generate response for you for the given query."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Construct a signaturre class, which defines the input fields and output fields needed. \n",
    "Also, give docstrings and description in verbose, so that the dspy signature could understand the context and compile best prompt for the usecase."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [],
   "source": [
    "class GenerateAnswer(dspy.Signature):\n",
    "    \"\"\"Think and Answer questions based on the context provided.\"\"\"\n",
    "\n",
    "    context = dspy.InputField(desc=\"may contain relevant facts about user query\")\n",
    "    question = dspy.InputField(desc=\"User query\")\n",
    "    answer = dspy.OutputField(desc=\"Answer in one or two lines\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Define the module with the actions needs to be performed, here we are showing a small RAG use case where we are retrieving similar contexts using our retriever class and generating response based on the factual context using one of the DSPy module `ChainOfThought`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [],
   "source": [
    "class RAG(dspy.Module):\n",
    "    def __init__(self):\n",
    "        super().__init__()\n",
    "\n",
    "        self.retrieve = dspy.Retrieve()\n",
    "        self.generate_answer = dspy.ChainOfThought(GenerateAnswer)\n",
    "    \n",
    "    def forward(self, question):\n",
    "        context = self.retrieve(question).passages\n",
    "        prediction = self.generate_answer(context=context, question=question)\n",
    "        return dspy.Prediction(context=context, answer=prediction.answer)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we are passing our query and retrieving relevant chunks using clarifai retriever and based on factual evidence, model is able to generate response."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Question: can I test my vehicle engine in pit before inspection?\n",
      "Predicted Answer: No, you cannot test your vehicle engine in the pit before inspection.\n",
      "Retrieved Contexts (truncated): ['A 6.8.3\\n\\nCranking engines in the pits is allowed, when the following conditions are met:\\n• The vehicle has passed mechanical inspection.\\n• The driven axles are securely jacked up.\\n• Gearbox is in neut...', 'A 6.8.3\\n\\nCranking engines in the pits is allowed, when the following conditions are met:\\n• The vehicle has passed mechanical inspection.\\n• The driven axles are securely jacked up.\\n• Gearbox is in neut...']\n"
     ]
    }
   ],
   "source": [
    "# Ask any question you like to this RAG program.\n",
    "my_question = \"can I test my vehicle engine in pit before inspection?\"\n",
    "\n",
    "# Get the prediction. This contains `pred.context` and `pred.answer`.\n",
    "Rag_obj= RAG()\n",
    "predict_response_llama70b=Rag_obj(my_question)\n",
    "\n",
    "# Print the contexts and the answer.\n",
    "print(f\"Question: {my_question}\")\n",
    "print(f\"Predicted Answer: {predict_response_llama70b.answer}\")\n",
    "print(f\"Retrieved Contexts (truncated): {[c[:200] + '...' for c in predict_response_llama70b.context]}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Now we will compare our RAG DSPy module with different community models from clarifai and comapare responses"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Mistral-7b Instruct"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {},
   "outputs": [],
   "source": [
    "mistral_lm = dspy.Clarifai(model=\"https://clarifai.com/mistralai/completion/models/mistral-7B-Instruct\", api_key=PAT, n=2, inference_params={'temperature':0.6})\n",
    "dspy.settings.configure(lm=mistral_lm, rm=retriever_model)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Question: can I test my vehicle engine in pit before inspection?\n",
      "Predicted Answer: Reasoning: According to the context, cranking engines in the pits is allowed only when the vehicle has passed mechanical inspection.\n",
      "\n",
      "Answer: No, you cannot test your vehicle engine in pit before inspection.\n",
      "Retrieved Contexts (truncated): ['A 6.8.3\\n\\nCranking engines in the pits is allowed, when the following conditions are met:\\n• The vehicle has passed mechanical inspection.\\n• The driven axles are securely jacked up.\\n• Gearbox is in neut...', 'A 6.8.3\\n\\nCranking engines in the pits is allowed, when the following conditions are met:\\n• The vehicle has passed mechanical inspection.\\n• The driven axles are securely jacked up.\\n• Gearbox is in neut...']\n"
     ]
    }
   ],
   "source": [
    "my_question = \"can I test my vehicle engine in pit before inspection?\"\n",
    "Rag_obj= RAG()\n",
    "predict_response_mistral=Rag_obj(my_question)\n",
    "\n",
    "print(f\"Question: {my_question}\")\n",
    "print(f\"Predicted Answer: {predict_response_mistral.answer}\")\n",
    "print(f\"Retrieved Contexts (truncated): {[c[:200] + '...' for c in predict_response_mistral.context]}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Gemini Pro"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "gemini_lm = dspy.Clarifai(model=\"https://clarifai.com/gcp/generate/models/gemini-pro\", api_key=PAT, n=2)\n",
    "dspy.settings.configure(lm=gemini_lm, rm=retriever_model)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Question: can I test my vehicle engine in pit before inspection?\n",
      "Predicted Answer: No, you can't test your vehicle engine in the pits before inspection.\n",
      "Retrieved Contexts (truncated): ['A 6.8.3\\n\\nCranking engines in the pits is allowed, when the following conditions are met:\\n• The vehicle has passed mechanical inspection.\\n• The driven axles are securely jacked up.\\n• Gearbox is in neut...', 'A 6.8.3\\n\\nCranking engines in the pits is allowed, when the following conditions are met:\\n• The vehicle has passed mechanical inspection.\\n• The driven axles are securely jacked up.\\n• Gearbox is in neut...']\n"
     ]
    }
   ],
   "source": [
    "my_question = \"can I test my vehicle engine in pit before inspection?\"\n",
    "Rag_obj= RAG()\n",
    "predict_response_gemini=Rag_obj(my_question)\n",
    "\n",
    "print(f\"Question: {my_question}\")\n",
    "print(f\"Predicted Answer: {predict_response_gemini.answer}\")\n",
    "print(f\"Retrieved Contexts (truncated): {[c[:200] + '...' for c in predict_response_gemini.context]}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Clarifai enables you to test your dspy module with different llm models and compare the response as it is crucial part of prompt engineering to test and achieve the right combination of llm models with right prompt."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
